Bibliometric analysis of kinship analysis from 1960 to 2023: global trends and development

Kinship analysis is a crucial aspect of forensic genetics. This study analyzed 1,222 publications on kinship analysis from 1960 to 2023 using bibliometric analysis techniques, investigating the annual publication and citation patterns; the most productive countries, organizations, authors and journals; the most cited documents; and the co-occurrence of keywords. The first publication in this field appeared in 1960. Since 2007, there has been a significant increase in publications, with over 30 published annually except for 2010. China had the most publications (n = 213, 17.43%), followed by the United States (n = 175, 14.32%) and Germany (n = 89, 7.28%). The United States had the highest citation count. Sichuan University in China had the largest number of published articles. The University of Leipzig and the University of Cologne in Germany had the highest total citation count and average citation, respectively. Budowle B was the most prolific author and Kayser M was the most cited author. In terms of publications, Forensic Science International-Genetics, Forensic Science International, and International Journal of Legal Medicine were the most prolific journals. Among them, Forensic Science International-Genetics had the highest h-index, citation count, and average citation rate. The most frequently cited publication was “Van Oven M, 2009, Hum Mutat”, with a total of 1,361 citations. The most frequent co-occurrence keywords included “DNA”, “Loci”, “Paternity testing”, “Population”, “Markers”, and “Identification”, with recent interest focusing on “Kinship analysis”, “SNP” and “Inference”. Current research is centered on microhaplotypes, forensic genetic genealogy, and massively parallel sequencing. The field has advanced with new DNA analysis methods, tools, and genetic markers. Collaborative research among nations, organizations, and authors benefits idea exchange, problem-solving efficiency, and high-quality results.


Introduction
Kinship analysis involves determining whether a certain kinship relationship exists between individuals by testing genetic markers, based on the principles of heredity (Weir et al., 2006). In the past, kinship analysis focused mostly on paternity tests to confirm the father-child link.
In ancient times, various methods were used to identify kinship, but there was no scientific evidence to confirm their accuracy (Silver, 1989). The first scientific approach to paternity testing can be attributed to the identification of blood groups (Figure 1) (Landsteiner, 1900; von Dungern and Hirschfeld, 1962), and in 1926 Austria pioneered the acceptance of forensic serology as admissible evidence in paternity testing cases (Mayr et al., 1991). The genetic markers of second-generation paternity testing were serum proteins and erythrocyte isoenzymes, which addressed the difficulties caused by the large number of individuals sharing the same blood group (Smithies, 1955; Hirschfeld et al., 1960; Dykes and Polesky, 1976). The discovery of the highly efficient HLA-I antigen in 1958 marked the emergence of third-generation identification technology (Dausset, 1958). In the early 1980s, the fourth generation emerged (Walker and Crisan, 1991), using DNA probes to detect restriction fragment length polymorphisms (RFLPs) (Southern, 1975). In the subsequent decade, the development of polymerase chain reaction (PCR) (Saiki et al., 1988) and capillary electrophoresis (CE) propelled short tandem repeat (STR) length polymorphism to the forefront of forensic genetics. CE-based STR typing proved to be precise, cost-effective (Baine and Hui, 2019), and streamlined, and rapidly established itself as the predominant method worldwide. With the advancement of DNA sequencing technologies, such as microarray genotyping (Amorim and Pereira, 2005) and next-generation sequencing (NGS) (Reis-Filho, 2009), the exploration of DNA sequence polymorphisms, including single nucleotide polymorphisms (SNPs) and microhaplotypes (MHs), has progressively increased. These methodologies complement STR analysis (Butler, 2007; Tam et al., 2020), but their widespread adoption has been hindered by high costs. Currently, CE-based STR analysis remains the gold standard for paternity testing in forensic DNA analysis (Butler et al., 2004).
Forensic kinship research has flourished in recent years, mainly owing to the expansion of genetic markers and sample sizes in relevant studies (Bertoglio et al., 2020; Alterauge et al., 2021; Zou et al., 2022; Cui et al., 2023). Although CE is a convenient and cost-effective approach for detecting STR loci, the high mutation rate, long amplified fragments, and limited number of STR loci restrict their use in complex kinship analysis (Alterauge et al., 2021; Zhang et al., 2022; Cui et al., 2023). SNPs and insertion-deletions (InDels) have shown more obvious advantages in complex kinship analysis (Nothnagel et al., 2010; Wu et al., 2021; Zhang et al., 2022). Zhu et al. used the MGISEQ-2000RS platform to sequence 1,993 SNP loci in 119 Chinese Han individuals from eight families and found that the panel could be applied in paternity testing, full sibling testing, second-degree kinship, and first cousin kinship analyses (Cui et al., 2023). Liang et al. performed a genome-wide screen for new MH markers consisting of two or more variants (InDels or SNPs) within 220 bp and successfully developed an NGS-based 67-plex MH panel to complement complex kinship analysis (Xue et al., 2023). The availability of human genetic data has significantly increased due to the commercialization of DNA testing and the public's interest in their DNA and genetic ancestry (Phillips, 2018; Glynn, 2022; Snedecor et al., 2022; Tvedebrink, 2022). This has led to the emergence of forensic genetic genealogy (FGG). In 2018, the investigation of the Golden State Killer case in the United States opened the door to the application of FGG technology and was named one of the top ten scientific breakthroughs of the year by Science, which garnered significant attention within the domain of kinship analysis (Kaiser, 2018; Phillips, 2018; Ram and Roberts, 2019; Glynn, 2022).
[Figure 1 caption: The historical timeline of milestones in kinship analysis. The progress of genetic theory lays the foundation for kinship analysis. With the discovery of genetic markers and the progress of analysis technology, the field is developing continuously.]
Kinship analysis is one of the main tasks of forensic science. The scope of kinship analysis has expanded from the conventional parent-child relationship (usually the father-child relationship) to complex kinship analysis involving relationships such as full sibling, great-grandson and half-sibling (Cui et al., 2023). Kinship analysis plays an important role in inheritance disputes, disaster victim identification, and criminal investigations (Bertoglio et al., 2020; Alterauge et al., 2021). The identification of blood types elevated kinship analysis into the realm of science. With the emergence of PCR and CE techniques, STR profiles based on DNA length polymorphisms can accurately determine genetic relationships between individuals. NGS technology has enabled the extensive use of new genetic markers such as SNPs and MHs, which are based on DNA sequence polymorphisms, in complex kinship analysis. In recent years, research in this field has boomed, and this paper aims to describe the general state of kinship analysis through bibliometric analysis. This study used the Web of Science Core Collection database to conduct a bibliometric analysis of relevant literature in the field of kinship analysis from 1960 to 2023, with the aim of identifying the most influential countries and authors and evaluating current research directions in this field.

Database and search strategy
We performed a literature search using the Web of Science (WoS) Core Collection (Science Citation Index Expanded) Database on 23 January 2024, covering literature published from 1960 to 2023.
The search strategy was as follows. First, the following topic terms were searched in the WoS Core Collection: paternity testing, paternity DNA testing, paternity forensic testing, kinship testing, kinship identification, kinship inference, kinship analysis, and forensic genealogy. All categories of publications were considered, and no time restrictions were placed. A total of 7,261 papers were retrieved, of which 7,246 were dated before 2024. Second, we limited the publications in the field of kinship analysis to those indexed under the research category "Medicine Legal" or "Genetics Heredity" in the WoS database and identified 2,093 papers. Two independent investigators evaluated all documents, focusing on titles and abstracts to verify that the documents were related to kinship analysis. If necessary, the investigators read the full text to decide on inclusion. Finally, 1,222 papers were included and exported from the WoS (Figure 2).
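The search and screening steps above can be sketched in a few lines of Python. This is only an illustration: the text does not give the exact query string, so combining the terms with OR, quoting them as phrases, and using the WoS Topic field tag `TS=` are assumptions, and the helper names `build_topic_query` and `retention` are hypothetical. The record counts are those reported above.

```python
# Topic terms as listed in the search strategy.
TERMS = [
    "paternity testing", "paternity DNA testing", "paternity forensic testing",
    "kinship testing", "kinship identification", "kinship inference",
    "kinship analysis", "forensic genealogy",
]


def build_topic_query(terms):
    """Combine terms into one Topic (TS=) query.

    Assumption: terms are OR-combined and quoted as exact phrases;
    the source does not state how they were combined.
    """
    return "TS=(" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Record counts reported at each screening stage.
FUNNEL = [
    ("retrieved", 7261),
    ("dated before 2024", 7246),
    ("Medicine Legal / Genetics Heredity", 2093),
    ("included after screening", 1222),
]


def retention(funnel):
    """Percentage of the previous stage's records kept at each stage."""
    return [
        (name, round(100 * n / prev, 2))
        for (_, prev), (name, n) in zip(funnel, funnel[1:])
    ]


if __name__ == "__main__":
    print(build_topic_query(TERMS))
    for name, pct in retention(FUNNEL):
        print(f"{name}: {pct}% of previous stage")
```

The retention percentages make it easy to see which screening step removed the most records.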

Data analysis and visualization
Bibliometric analysis of the 1,222 documents was performed using VOSviewer (version 1.6.19) software, open-source Biblioshiny (RStudio) and MS Excel. VOSviewer is a tool for constructing and visualizing bibliometric networks based on publications, countries, authors, journals and keywords (Van Eck and Waltman, 2010). Biblioshiny provides a graphical interface and a complete set of bibliometric and visualization methods, which makes it well suited to bibliometric analysis.
To analyze the basic trends of the articles in kinship analysis, the following indicators were selected: annual scientific productivity, top contributing countries and organizations, top 20 productive authors, top 20 journals by publications, top 20 cited articles, top 20 co-occurrence keywords, and the change of topics. The three-field plot representing the relationships among authors, countries and sources was compiled using Biblioshiny. To better evaluate the impact of researchers, we introduced the h-index, g-index and m-index. The h-index, a measure of academic impact, is the maximum value of h such that an author has published at least h papers, each cited at least h times. For example, if an author has published 10 papers, and each of them has been cited at least 10 times, their h-index would be 10. The g-index supplements the h-index by taking into account the citation counts of highly cited papers: it is the largest number g such that the top g articles together have received at least g² citations. The m-index is defined as h/n, where h represents the h-index and n represents the number of years since the scientist's first published paper. The m-index accounts for the effect of scholars' differing career lengths on citation counts.
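The three indices defined above can be illustrated with a minimal Python sketch. It is not the Biblioshiny implementation; the function names are hypothetical, the g-index is capped at the number of papers, and counting the first publication year as year 1 when computing n is an assumption.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g**2 citations.

    Capped at the number of papers in the list.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def m_index(citations, first_year, current_year):
    """h-index divided by the number of years since the first paper.

    Assumption: the first publication year counts as year 1.
    """
    years = current_year - first_year + 1
    return h_index(citations) / years


if __name__ == "__main__":
    example = [50, 20, 8, 5, 3, 1]          # hypothetical citation counts
    print(h_index([10] * 10))               # the 10-paper example from the text
    print(h_index(example), g_index(example))
    print(m_index(example, 2014, 2023))
```

For the hypothetical author above, the single highly cited paper raises the g-index above the h-index, which is the behaviour the g-index is meant to capture.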

The annual trends in growth and average citations of publications
Counting the number of publications and analyzing the development trend can help predict the future direction of kinship analysis. This field has attracted great interest among researchers worldwide. The first publication in this field appeared in 1960, and except for one publication in 1969, no further publications appeared until 1973 (Figure 3). From 1960 to 1990, no more than 10 publications were published each year. However, since 2007, there has been a considerable increase in the number of publications, with more than 30 published each year except for 2010. The year 2020 had the highest number of publications (68), followed by 2023, 2021 and 2022 with 64, 64, and 63 publications, respectively. The highest mean citations were in 2005 (62.52 citations per publication), followed by 2009 and 1993, with 62.15 and 59.83 citations per publication, respectively (Figure 3). In other periods, the mean citations remained lower than 50.

Countries and organizations that make the greatest contribution
A total of 1,222 publications were published in the field of kinship analysis, with China having the highest number of publications at 213, accounting for 17.43% (Table 1). The United States was next with 175 (14.32%) publications, followed by Germany (89, 7.28%), Japan (57, 4.66%), Brazil (53, 4.34%), Spain (48, 3.93%), the United Kingdom (47, 3.85%), Italy (37, 3.03%), and the Netherlands and Portugal (27 each, 2.21%). Ten countries (Korea, Denmark, Argentina, Norway, France, Poland, Sweden, Belgium, Switzerland and Australia) each had between 14 and 25 publications, while all other countries had fewer than 14. Publications from China and the United States together constitute a large part of the overall documents (388, accounting for 31.75%). Among the countries with more than 30 publications, all except China and Brazil are developed countries. Single Country Publications (SCPs) and Multiple Country Publications (MCPs) reflect the domestic and international cooperation of countries in kinship analysis (Figure 4). China has the highest number of SCPs, while the United States has the second highest.
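The percentage shares quoted above follow directly from the publication counts and the total of 1,222. The short Python sketch below recomputes them as a consistency check; the variable and function names are illustrative only.

```python
TOTAL = 1222  # publications included in the study

# Publication counts per country as reported in the text.
COUNTS = {
    "China": 213, "United States": 175, "Germany": 89, "Japan": 57,
    "Brazil": 53, "Spain": 48, "United Kingdom": 47, "Italy": 37,
    "Netherlands": 27, "Portugal": 27,
}


def share(n, total=TOTAL):
    """Percentage share of the total, rounded to two decimals."""
    return round(100 * n / total, 2)


SHARES = {country: share(n) for country, n in COUNTS.items()}
COMBINED_TOP2 = share(COUNTS["China"] + COUNTS["United States"])

if __name__ == "__main__":
    for country, pct in SHARES.items():
        print(f"{country}: {COUNTS[country]} ({pct}%)")
    print(f"China + United States: {COMBINED_TOP2}%")
```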
The United States has the most citations, with 7,692 (an average of 44.00 citations per article) (Table 1). The United Kingdom followed with 3,145 citations (an average of 66.90), Germany with 2,563 citations (an average of 28.80), the Netherlands with 2,528 citations (an average of 93.60), and China with 2,099 citations (an average of 9.90). In terms of average article citations, the Netherlands, the United Kingdom and France rank in the top three. Using VOSviewer, we screened countries and regions with more than 10 publications, and 30 out of 95 countries and regions met the criteria (Figure 5A). The graph illustrating the co-occurrence relations among countries presents
[Figure 3 caption: Chronological distribution of publications and mean total citation per publication in the field of kinship analysis.]
[Figure 4 caption: Inter- and intra-country collaboration. MCP indicates collaboration among different countries, while SCP indicates the production of a single country. Countries were selected based on the corresponding author's country.]
[Figure caption: Co-authorship network visualization map of institutions in kinship analysis, produced using VOSviewer. Each node signifies an individual organization, and the size of each circle corresponds to the number of publications. The connecting lines symbolize collaborations among organizations, and the distance between two items roughly indicates their level of relatedness. Nodes of the same color belong to the same cluster.]
University, Southern Medical University, and University of Copenhagen play significant roles as research partners for multiple institutions.

Most productive authors
The most prolific authors in terms of articles were Budowle B (40 articles, 968 citations), Gusmao L (31 articles, 624 citations), and Morling N (29 articles, 1,186 citations) (Table 3). However, the number of articles authored by an individual did not necessarily correlate with the number of citations received. For example, Kayser M, who authored fewer articles (15, ranking 11th), received 2,267 citations, whereas Budowle B, with the highest number of articles (40), received only 968 citations. To address this disparity, we used evaluation metrics such as the h-index, g-index, and m-index. The authors with the highest h-index scores were Budowle B (h-index of 18), Morling N (h-index of 17), and Edelmann J (h-index of 16). Budowle B (g-index of 30), Morling N (g-index of 29), Chakraborty R (g-index of 24) and Gusmao L (g-index of 24) achieved the top four g-index scores. In terms of the m-index, Pinto N had the highest score (m-index of 0.667), followed by MHD (m-index of 0.643), Edelmann J (m-index of 0.593), and Szibor R (m-index of 0.593). Budowle B, ranked first in the number of published articles, published the largest number of articles in 2011 (Figure 7). Gusmao L, who ranked second, began to explore kinship analysis in 2000 and made the most significant contribution in 2010. Morling N, ranked third, began research in the field in 1993 and published the most articles in 2002 and 2012.

Analysis of high-yielding journals
Table 4 provides a list of the most prominent journals that have published research articles on kinship identification. Forensic Science International-Genetics (249 articles, 5,011 citations) is the most popular journal for kinship analysis in terms of publications. International Journal of Legal Medicine (228 articles, 3,766 citations) ranks second, followed by Forensic Science International (154 articles, 3,538 citations), Journal of Forensic Sciences (93 articles, 1,576 citations), Legal Medicine (49 articles, 326 citations), and American Journal of Human Genetics (41 articles, 2,364 citations). In terms of citations, the ranking is as follows: Forensic Science International-Genetics holds the top position, followed by International Journal of Legal Medicine, Forensic Science International, American Journal of Human Genetics, and Journal of Forensic Sciences.

Analysis of number of citations
The publications were ranked by number of citations. Among the top 20 cited articles (Table 5), the most cited article is "Van Oven M, 2009, Hum Mutat" with 1,361 citations. In this study (van Oven and Kayser, 2009), the authors constructed an updated comprehensive phylogeny of global human mitochondrial DNA (mtDNA) variation, based on both coding and control region mutations. "Wang JL, 2004, Genetics" ranks second with 777 citations, followed by "Liu XL, 2016, PLOS Genet" and "Queller DC, 1993, Trends Ecol Evol" with 627 and 613 citations, respectively. Only four articles have more than 500 citations, while the subsequent 15 articles all had more than 200 citations.

Trend topic
The topic trends in this field from 2018 to 2023 were examined by analyzing keywords using Biblioshiny (RStudio) (Figure 10). In the initial phase, research in this field was characterized by prominent keywords such as "Kinship analysis" (44 occurrences), "SNP" (87 occurrences), and "Forensic Genetics" (91 occurrences). Notably, among all the keywords, "Forensic Genetics" emerged with the highest frequency. In recent years, the research focus and trend have shifted towards microhaplotypes, forensic genetic genealogy and massively parallel sequencing.

Discussion
Kinship analysis plays a pivotal role in numerous fields, owing to its ability to uncover relationships among individuals and to illuminate social structures, biological relationships, and historical contexts. In genetic research, kinship analysis provides invaluable insights into inherited diseases, population genetics, and evolutionary studies. By studying the genetic relatedness between individuals, scientists have discovered genetic markers for various diseases and traced the migration patterns of ancient human populations (Klein et al., 2005; Kruglyak, 2005; Chen and Nedoluzhko, 2023). Additionally, kinship inference is pivotal in forensic genetic genealogy, not only facilitating the exploration of ancestral origins (Mateen et al., 2021) and the tracing of family trees (Willson et al., 2022), but also aiding public security in solving criminal cases (Greytak et al., 2019). For instance, genealogy websites and DNA testing services have helped people connect with long-lost relatives and gain a deeper understanding of their roots (Khan and Mittelman, 2018). Moreover, in legal matters, kinship analysis assists in determining legal rights, inheritance, and the resolution of disputed relationships. Courts often rely on kinship identification to establish biological or legal relationships in disputes over inheritance or child custody.

Chronological distribution of publications
In this bibliometric analysis, we screened 1,222 documents related to kinship analysis from the Web of Science Core Collection database. Publications on kinship analysis have grown steadily since 2010, nearly tripling by 2023. This growth is driven by the ever-expanding interest in understanding human relationships and genetics. Researchers and scholars have actively contributed to the field, catalyzing a notable increase in publication volume. The availability of advanced technologies and improved research methods has also supported the rise in publications. Since the establishment of the first STR database in 1995 (Amankwaa and McCartney, 2018), an average annual publication output of roughly 20 has been consistently observed. In 2010, the 1000 Genomes Project Consortium published an article in Nature that included information on 15 million SNPs and one million InDels, which served as a watershed moment (Consortium, 2010). Subsequently, more than 30 publications have emerged annually, except 2013. In 2018, the Golden State Killer was arrested through forensic genetic genealogy (Phillips, 2018). This new kinship inference method has attracted widespread attention. Since then, the publication output has surpassed 50 papers every year. In terms of citations, it is noteworthy that high citation frequencies were observed in 1993, 2005, and 2009. The elevated citation rate in 1993 can be attributed to the gradual exploration of the utility of microsatellite markers in forensic genetics, characterized by their high variability among individuals and their efficacy in discerning relatedness between samples (Queller et al., 1993). The peak citation rate in 2005 was due to a novel genetic marker, the SNP, which demonstrated considerable potential for complex kinship analysis and offered advantages over STRs (Sobrino et al., 2005). The significant increase in citation rates in 2009 was primarily attributed to the publication of a comprehensive phylogeny of global human mtDNA variation (van Oven and Kayser, 2009) as well as research conducted on the human Y chromosome (Goedbloed et al., 2009; King and Jobling, 2009).

The contributions of countries and organizations
We first analyze the contributions of countries, organizations, authors and journals in kinship analysis research. China and the United States have emerged as the leading contributors in this field, accounting for a substantial 388 publications, which represents over 30% of the total.
[Figure caption: Co-occurrence analysis of countries.]
scholars benefit from extensive collaborations with their international counterparts, leading to a profound impact on scholarly work. Moreover, this ascendancy is also attributable to the great economic and scientific research strength of the United States (Lei et al., 2019; Chen Z. et al., 2023). Despite China having the largest number of publications in kinship analysis, its average citation ranking among the top 20 countries is only 17th. Average citations often serve as a barometer of a research work's influence and value within its field. The lower average citation might be attributed to the shorter time since publication of Chinese papers (Figure 5), which also implies that Chinese scholars need to prioritize the quality of their research outcomes over sheer quantity. Nevertheless, it is noteworthy that China, as a developing country, has made substantial strides in kinship analysis research with the largest number of publications, holding promising prospects for the future. Domestic collaboration trends among nations underscore the need for increased international exchange in the field of kinship analysis. International collaboration fosters a broader perspective and a thorough understanding of kinship analysis.
Sichuan University in China has the largest number of published articles. The University of Leipzig and the University of Cologne in Germany exhibit the highest total citation count and average citation, respectively, indicating Germany's notable influence in this field. Among the top 20 institutions, seven are from China, four from Germany, two from Sweden, and the remaining seven from other developed countries. Research results are closely linked to financial investment, personnel training, research culture, and international collaboration. Developed nations possess greater resources and talent for conducting kinship inference research. Prestigious universities with rich academic achievements are more likely to garner increased support, attract superior talent, secure ample funding for scientific research, and cultivate an environment conducive to innovation and exploration. Moreover, they also enjoy more opportunities for global exchange.

The impact of authors and journals
Professor Budowle B from the University of North Texas Health Science Center in the United States published the most articles. He is renowned for his expertise in forensic science, specializing in the DNA identification of missing individuals in mass disasters (Budowle et al., 2005b), as well as the development and application of sequencing technology (Seo et al., 2013; Warshauer et al., 2013; Zeng et al., 2015), forensic microbiology (Schmedes et al., 2016; Schmedes et al., 2017), and genetic marker loci research (Budowle et al., 2005a; Larue et al., 2012). Meanwhile, Kayser M from Erasmus MC University Medical Center Rotterdam in the Netherlands has the highest citation count and average citation rate in the field, largely attributable to his pioneering work on global human mtDNA variation (van Oven and Kayser, 2009). Kennett D from University College London, who has the highest m-index and works on FGG, described the process by which dense SNP data are used to infer distant relationships (Kling et al., 2021). Forensic Science International-Genetics, Forensic Science International, and International Journal of Legal Medicine are the top three journals by the number of publications in this study. Furthermore, Forensic Science International-Genetics has the highest citation rate within this field. A bibliometric analysis of forensic genetics also found Forensic Science International-Genetics to be preeminent, with the highest number of articles and citations in forensic genetics (Stasi et al., 2023). For scholars conducting kinship analysis research and aiming to publish in high-impact journals, Forensic Science International-Genetics could be considered a favorable option. This journal provides an excellent platform for researchers in the field of kinship analysis to present their work and contribute to the advancement of forensic genetics.

Genetic markers contribute to complex kinship analysis
With the rapid development of society, kinship analysis limited to the parent-child relationship can no longer meet the needs of disaster victim identification and criminal investigations. Complex kinship encompasses relationships such as grandparent-grandchild, uncle/aunt-nephew/niece, full sibling, half-sibling, and first or second cousins. First cousins share a grandparent (2 generations) and second cousins share a great-grandparent (3 generations). Currently, several genetic markers are employed in forensic DNA analysis to address complex kinship analysis, such as autosomal STRs, Y-chromosomal STRs, X-chromosomal STRs, mtDNA, SNPs, InDels and MHs.
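To make the cousin relationships above concrete, the following Python sketch computes the expected coefficient of relatedness for cousins from the number of generations back to the shared ancestors. It is a textbook pedigree calculation rather than a method from this study, and it rests on stated assumptions: full cousins share an ancestral couple (two common ancestors), there is no inbreeding, and a relationship of degree d corresponds to an expected relatedness of (1/2)^d. The function names are hypothetical.

```python
from fractions import Fraction


def cousin_relatedness(generations_to_ancestor, shared_ancestors=2):
    """Expected relatedness of two cousins.

    Each cousin is `generations_to_ancestor` meioses below the common
    ancestor(s), so each ancestral path contributes (1/2) ** (2 * generations).
    Assumption: full cousins share two ancestors (a couple); use
    shared_ancestors=1 for half cousins.
    """
    meioses = 2 * generations_to_ancestor
    return shared_ancestors * Fraction(1, 2) ** meioses


def degree_from_relatedness(r):
    """Degree d such that (1/2) ** d equals r (convention assumed here)."""
    degree = 0
    value = Fraction(1)
    while value > r:
        value /= 2
        degree += 1
    return degree


if __name__ == "__main__":
    for label, generations in [("first", 2), ("second", 3), ("third", 4)]:
        r = cousin_relatedness(generations)
        print(f"{label} cousins: r = {r}, degree {degree_from_relatedness(r)}")
```

Under these assumptions, expected sharing halves with every additional degree, which is why distant relationships require far more markers than parent-child testing.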
Autosomal STRs account for about 5% of the human genome, of which about 50% have genetic polymorphisms, mainly distributed in non-coding regions, and are suitable for most complex kinship analysis. However, conventional STR detection methods utilizing CE technology typically amplify fewer than 50 STR loci (Martín et al., 2014; Wang et al., 2015; Song et al., 2023), and it is difficult to obtain complete STR profiles from trace DNA of less than 100 pg (Xu et al., 2022). This restricted range of STR loci greatly hampers their applicability in intricate kinship analysis. To overcome this limitation, Cong et al. capitalized on the high-throughput capability of the NGS method, which enables simultaneous sequencing of numerous genomic regions in a single reaction (Børsting and Morling, 2015). They successfully developed an NGS-STR typing system including 42 autosomal STR loci and an amelogenin marker, which showed remarkable efficacy for second-degree kinship analysis (Liu et al., 2020).
The sex STR markers specifically target regions on the Y chromosome for males and the X chromosome for females. Y-STR haplotype analysis is a prevalent tool for paternal kinship testing in historical cases, missing persons and disaster victim identification involving males. In contrast to autosomal STRs, Y-STR profiling can trace distant relatives and circumvent the potential sharing of autosomal alleles between victim and perpetrator in sexual assault cases (Kayser, 2017). Moreover, rapidly mutating (RM) Y-STRs have been reported to successfully differentiate between close and distant male relatives (Ballantyne et al., 2012; Ralf et al., 2020; Ralf et al., 2021; Wang F. et al., 2022). Owing to their higher mutation rate compared with standard Y-STRs, RM Y-STRs significantly enhance the differentiation among male relatives within the same paternal lineage (Wang F. et al., 2022). Nevertheless, Y-STR haplotypes have a higher variability than single autosomal STR loci, and it is therefore imperative for the Y-STR haplotype database to be larger than the autosomal STR allele database in order to ensure reliability (Kayser, 2017), which implies an additional investment in both time and financial resources. In complex kinship analysis, such as full-sib girls or half-sib girls, the use of X-STRs is particularly important because they have a higher ability to exclude or identify than autosomal STRs (Kling et al., 2015). X-STRs have also been shown to resolve complex kinship in cases where X-chromosomal lineages can be investigated (Becker et al., 2008).
Moreover, by using autosomal STRs alongside sex STRs in kinship analysis, researchers can obtain a more accurate assessment of biological relatedness. This dual-marker strategy helps overcome limitations that may arise when relying solely on either autosomal or sex-linked genetic data. For instance, while autosomal STRs offer broader coverage across all chromosomes and can be used to analyze relationships between any two individuals regardless of their gender, they may not always provide conclusive results due to factors like mutations or shared ancestry within populations (Amorim and Pereira, 2005). In a recent study on skeletons of Romanized indigenous people from the 5th to 6th century, researchers utilized autosomal STR typing and the PowerPlex Y23 kit for Y-STR typing, confirming that four skeletons were members of the same family (a father, two daughters, and a son) (Pajnič et al., 2023). MtDNA is particularly advantageous in cases with limited nuclear DNA or when confirmation of maternal lineage is required (van Oven and Kayser, 2009; Syndercombe Court, 2021). Its efficacy in the analysis of bones, teeth, and hair makes it a common choice for ancient DNA research and disaster victim identification triage (Kurosaki et al., 1993; Syndercombe Court, 2021). Despite its advantages, mtDNA analysis encounters difficulties when dealing with ancient or degraded samples, as contamination, amplification verification, and interference from other genomic regions can pose issues (Loreille et al., 2018; Syndercombe Court, 2021).
Compared with STRs, SNPs, InDels and MHs are novel genetic markers well suited for kinship analysis, characterized by a lower mutation rate, shorter amplicon size, high stability and the absence of stutter peaks (Børsting et al., 2012; Wei et al., 2014; de la Puente et al., 2020; Bai et al., 2022; Yu et al., 2022; Yuan et al., 2024). The determination of the number and lengths of identity-by-descent segments using high-density SNP or whole-genome sequence data is a fundamental principle of FGG (Huff et al., 2011; Ertürk et al., 2022). However, binary markers like SNPs and InDels are less polymorphic than STRs, so a greater number of binary markers is needed to achieve the information content of STR markers (Amorim and Pereira, 2005; Wei et al., 2014; Yuan et al., 2024). InDels combine the benefits of SNPs and STRs, as they can be analyzed through a PCR-to-CE typing approach (Liu et al., 2017; Oldoni and Podini, 2019). MHs are genetic markers that are generally less than 300 bp and consist of a small cluster of closely linked SNPs (de la Puente et al., 2020; Bai et al., 2022; Yu et al., 2022). MH loci are single-copy and contain multiple SNPs, providing more information per locus than a single SNP, but still with lower polymorphism than STR loci (Oldoni and Podini, 2019).
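The trade-off between binary and multi-allelic markers described above can be illustrated with standard population-genetic quantities. The Python sketch below computes expected heterozygosity (1 minus the sum of squared allele frequencies) and the per-locus random match probability under Hardy-Weinberg equilibrium, then estimates how many independent SNPs would match the discriminating power of one STR. The allele frequencies are idealized, hypothetical values (a SNP with two equally frequent alleles, an STR with ten equally frequent alleles), the function names are illustrative, and match probability is used only as a simple proxy for information content; kinship likelihood ratios behave differently.

```python
import math


def heterozygosity(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def match_probability(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    Assumes Hardy-Weinberg equilibrium; sums squared genotype frequencies.
    """
    total = 0.0
    n = len(freqs)
    for i in range(n):
        for j in range(i, n):
            genotype = freqs[i] ** 2 if i == j else 2 * freqs[i] * freqs[j]
            total += genotype * genotype
    return total


def loci_needed(mp_single, mp_target):
    """Number of independent loci with match probability mp_single needed
    to reach a combined match probability of at most mp_target."""
    return math.ceil(math.log(mp_target) / math.log(mp_single))


SNP = [0.5, 0.5]      # hypothetical biallelic SNP
STR = [0.1] * 10      # hypothetical STR with ten equally frequent alleles

if __name__ == "__main__":
    print("SNP heterozygosity:", heterozygosity(SNP))
    print("STR heterozygosity:", heterozygosity(STR))
    mp_snp, mp_str = match_probability(SNP), match_probability(STR)
    print("SNPs needed to match one STR:", loci_needed(mp_snp, mp_str))
```

Even in this best case for the SNP, several binary loci are required per STR, which is consistent with the larger panels used for SNP- and InDel-based kinship analysis.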

The development and challenges brought by technological advancement
Sequencing technologies, especially NGS and third-generation sequencing (TGS), have become a hot research topic and trend in recent years and have had a great impact on kinship analysis. NGS technology can be utilized in the identification of SNPs, InDels, mtDNA, and STRs (Ballard et al., 2020; Feng et al., 2024). Compared to traditional CE analysis, NGS offers the following advantages: 1) It can simultaneously detect a large number of STR loci and has the ability to distinguish alleles with similar lengths or digital read count (Yang et al., 2014); 2) It is suitable for samples with low DNA content or degradation (Scheible et al., 2014); 3) NGS analysis can determine the full sequence of PCR products, including the STR repeat region and flanking regions (Ballard et al., 2020), allowing more variant information to be observed (Gettings et al., 2017; Phillips et al., 2018; Davenport et al., 2023) and facilitating the differentiation of mixed DNA (Phillips et al., 2007; Devesse et al., 2018). The high-throughput, fast and low-cost NGS technology has brought a new revolution in forensic science. The detection of DNA sequence polymorphism has been enhanced, and the exploration of novel genetic markers is steadily advancing. The advantages of MHs in the fields of mixture deconvolution (Oldoni et al., 2020; Tao et al., 2022; Yu et al., 2022), biogeographic ancestry inference (de la Puente et al., 2020; Tao et al., 2022), complex kinship analysis (Wen et al., 2022; Xue et al., 2023) and personal identification (Pu et al., 2017) have been extensively investigated. Based on NGS, FGG has played a significant role in identifying unknown remains (Bertoglio et al., 2020), inferring distant relatives (Kling et al., 2021), and solving cold cases (Phillips, 2018).
Since Joseph James DeAngelo was successfully identified as the prime suspect in the Golden State Killer case in 2018 (Phillips, 2018), forensic genetic genealogy has garnered widespread attention. The application of FGG has been reported to generate investigative leads in unresolved cold cases, and hundreds of cases have been solved using FGG technology (Murphy, 2018; Ram et al., 2018; Glynn, 2022). FGG employs high-density SNP profiles generated by microarray or whole-genome sequencing (WGS) to genotype biological samples or determine relatedness (Ertürk et al., 2022). SNP profiles are provided by high-density SNP profile databases such as GEDmatch, FamilyTreeDNA and DNASolves (Glynn, 2022), with data collected via direct-to-consumer (DTC) genetic testing (Majumder et al., 2021). Traditional kinship identification is mainly based on STR markers and analyzed by identical by state (IBS) or likelihood ratio (LR). In contrast, FGG mainly relies on whole-genome sequencing or high-density chip autosomal SNP typing and is analyzed by method-of-moment (MoM) or identical by descent (IBD) fragments (Ge and Budowle, 2021; Kling et al., 2021). FGG presents advantages over traditional kinship analysis methods. FGG can be used for kinship identification at the fifth degree and beyond, whereas traditional kinship identification cannot (Glynn, 2022). While traditional complex kinship analysis needs to increase the number of loci, FGG requires only a single WGS test. However, for third-generation cousins (seventh-degree relationships) or more distant relatives, individuals may not share any IBD fragments (Edge and Coop, 2020). Moreover, in certain cases, numerous relatives may present matches, demanding substantial resources for screening (Court, 2018). Additionally, it is controversial whether the police have the right to access the data and use it in the investigation of cases (Ram and Roberts, 2019), but Sweden has reported successful use of FGG in case detection work (Tillmar et al., 2021), and the United Kingdom (Samuel and Kennett, 2020) and Australia (Scudder et al., 2020) are contemplating future utilization of this technology.
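The observation that distant relatives may share no IBD fragments can be illustrated with a rough calculation. The sketch below uses a Poisson approximation of the kind found in the relationship-estimation literature, in which the expected number of shared autosomal segments for relatives separated by d meioses through a common ancestors is a(rd + c)/2^(d-1). The constants (about 35.3 recombination events per meiosis, 22 autosomes) are assumptions of this sketch, and it ignores the minimum segment length that real detection pipelines require, so it should be read as an optimistic illustration rather than a forensic model.

```python
import math

# Assumed constants for this sketch (not taken from the article).
RECOMBINATIONS_PER_MEIOSIS = 35.3  # approximate autosomal crossovers per meiosis
NUM_AUTOSOMES = 22


def expected_shared_segments(meioses, common_ancestors=2):
    """Expected number of autosomal IBD segments shared by two relatives
    separated by `meioses` meioses through `common_ancestors` ancestors."""
    numerator = RECOMBINATIONS_PER_MEIOSIS * meioses + NUM_AUTOSOMES
    return common_ancestors * numerator / 2 ** (meioses - 1)


def prob_no_shared_segment(meioses, common_ancestors=2):
    """Poisson probability that the two relatives share no IBD segment."""
    return math.exp(-expected_shared_segments(meioses, common_ancestors))


# Full cousins descend from an ancestral couple (two common ancestors).
for label, meioses in [
    ("first cousins", 4),
    ("second cousins", 6),
    ("third cousins", 8),
    ("fourth cousins", 10),
]:
    print(
        f"{label}: expected segments = {expected_shared_segments(meioses):.2f}, "
        f"P(no shared segment) = {prob_no_shared_segment(meioses):.4f}"
    )
```

The expected number of shared segments falls roughly by a factor of four with each additional cousin degree, which is why the probability of sharing nothing rises quickly for distant relationships.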
The application of TGS technology in forensic genetics is burgeoning, providing a new method for real-time detection of longer markers through its single-molecule sequencing and long-read techniques (Athanasopoulou et al., 2021; White and Hesselberth, 2022). Hou et al. (Wang Z. et al., 2022) employed the QNome, a nanopore genome sequencer developed by Qitan Technology, to genotype 15 MHs from 70 single-contributor samples, achieving an accuracy of 99.83%. This highlights the potential of the nanopore sequencing method in forensic analysis of MH markers. Additionally, large-scale genome projects have profound implications for kinship analysis, as they offer a vast amount of genetic data that can be harnessed by forensic experts. Through the examination of these data, forensic experts are able to identify and investigate new genetic markers relevant to forensics, thereby enhancing their capabilities in complex kinship analysis (Kureshi et al., 2020; Phillips et al., 2020; Frontanilla et al., 2022; Xue et al., 2022).

Challenges and future directions
The extraction of trace DNA and the analysis of mixed DNA remain ongoing challenges in the field (Supplementary Table S1). It has been observed that obtaining complete STR profiles becomes challenging when working with DNA samples containing less than 100 pg (Xavier et al., 2020; Xu et al., 2022). This limitation in sample quantity could potentially compromise the accuracy and reliability of kinship analysis. Drawing on approaches from ancient DNA analysis and conducting systematic research on DNA extraction methods could improve the quality and quantity of recovered DNA (Hofreiter et al., 2021). Furthermore, crime scenes frequently present investigators with mixed DNA samples, which consist of genetic material from multiple individuals. These mixed samples pose additional challenges during analysis due to a higher probability of drop-out or drop-in combined with stutter peaks (Gill et al., 2012; Bai et al., 2022). To overcome these hurdles, MHs have been extensively studied by scholars due to their combined advantages of STR and SNP markers, such as low mutation rate, high polymorphism, short length, and absence of stutter peaks (Kidd et al., 2013; Oldoni et al., 2019; Bai et al., 2022; Wen et al., 2022). Bai et al. developed a large panel consisting of 185 MHs to analyze degraded and/or mixed DNA samples, demonstrating its utility in conducting parentage, full sibling, and second-degree relative testing, but improvements are necessary to infer more distant relatives (third-degree relatives) (Bai et al., 2022). Furthermore, the application of single cell sequencing (SCS) technology, recognized as one of the top ten scientific breakthrough technologies in 2018 along with FGG by Science (Chen L. et al., 2023), warrants attention in kinship analysis. While FGG has been a research focus in forensic genetics, SCS has received less attention despite its promising potential. SCS is widely used in developmental biology, the generation of human cell maps, and cancer research (Tirosh et al., 2016; Venteicher et al., 2017). However, it is theoretically feasible to achieve complete sequencing with only a single cell, offering potential solutions to forensic challenges arising from trace DNA (Zong et al., 2012; Diepenbroek et al., 2021). Moreover, single-cell separation technology facilitates the isolation of individual cells from mixed samples, thereby eliminating the mixture and improving deconvolution (Farash et al., 2018; Diepenbroek et al., 2021). This approach, when applied to complex familial mixtures, can prevent the erroneous inclusion of non-donor relatives (Huffman and Ballantyne, 2022). Nevertheless, SCS also encounters challenges such as limited automation, reduced accuracy, and restricted applicability of DNA typing results in databases (Huffman and Ballantyne, 2023).
In addition to the aforementioned challenges, the realm of complex kinship analysis demands attention as well. Complex kinship relationships involve grandparent-grandchild, uncle/aunt-nephew/niece, full sibling, half sibling, and first or second cousins. Identifying these relationships accurately can be particularly difficult due to their intricate nature. To enhance the accuracy of identification in complex kinship analysis, forensic scientists may add STR, SNP and InDel genetic markers or adopt novel genetic markers (Zhang et al., 2022). The emergence of FGG has also significantly promoted distant relative inference (Glynn, 2022). However, even with advancements in technology and methods, kinship analysis of identical twins remains notably challenging (Yuan et al., 2020). Currently, promising approaches for distinguishing monozygotic twins include ultra-deep next-generation sequencing to identify rare mutations (Weber-Lehmann et al., 2014), various CpG sites (Mill et al., 2006; van Dongen et al., 2021) and microbial communities (Fierer et al., 2010; Martinez et al., 2013). In conclusion, the future research trend will involve the identification of novel genetic markers and the development of advanced analytical techniques. Challenges that still exist in this field include accurately and rapidly analyzing complex kinship, as well as successfully typing degraded DNA or mixed samples.

Conclusion
This study primarily focuses on the global trends and development of kinship analysis. From an overall perspective, research in the field of kinship analysis has gradually gained attention, with an increasing number of papers published over the years. In terms of countries, this research is mainly driven by developed and developing nations such as China, the United States, and Germany. Looking ahead, there is a desire to enhance international exchanges and involve more countries in kinship analysis research. At the same time, research in the field of kinship analysis concentrates on the identification of novel genetic markers and the development of advanced analytical techniques. However, numerous challenges still exist within this domain.

Five distinct clusters: Cluster 1: Australia, Canada, China, Finland, Germany, Israel, Japan, Pakistan, United Kingdom, United States. Cluster 2: Belgium, France, Netherlands, Norway, Poland, Russia, Sweden, Switzerland, Turkey. Cluster 3: Argentina, Brazil, Colombia, Mexico, Portugal, Spain. Cluster 4: India, Italy, South Korea. Cluster 5: Austria, Denmark. The research in the field of kinship analysis in the United States started earlier, while that in China started late (Figure 5B). Sichuan University ranked first with 40 articles, followed by the University of Porto (34), Southern Medical University (30), and Sun Yat-Sen University (30) (Figure 6; Table 2). The University of Cologne has the highest mean article citations, with 14 articles and an average of 77.93 citations per article, followed by the University of Leipzig (20 articles, average 63.25 citations) and Technische Universität Dresden (15 articles, average 53.00 citations). Total Link Strength (TLS) can reflect collaborative research between institutions. Sichuan University has the highest TLS score with 53 points. The Norwegian University of Life Sciences and the University of Porto ranked second and third with 52 and 49 points, respectively. The TLS scores, combined with the national cooperation network map (Figure 6), indicated that the Sichuan University, University of Porto, the Southern Medical University, the Sun Yat-Sen

FIGURE 5 Co-occurrence analysis of countries. (A) Collaborative relationships among various countries in kinship analysis were visualized using VOSviewer. The visualization displays a network of diverse countries. Each node corresponds to an individual country, and the size of each circle is determined by the quantity of publications. The connecting lines symbolize collaborations among countries, and the distance between two items roughly indicates their level of relatedness. Various colors denote distinct items. (B) Collaborative relationships among various countries in the field of kinship analysis were visualized using VOSviewer. Countries are distributed according to the average timing of their contributions. Green and blue circles represent earlier publications, while yellow circles denote more recent ones.

Figure 8 reveals the four major journals with the fastest growth in publications over the past decade: Forensic Science International-Genetics, International Journal of Legal Medicine, Forensic Science International and Journal of Forensic Sciences.

FIGURE 7 Contribution of top 10 authors over different years (red lines).The size of dots indicates the number of publications over different years, and the color of dots (light to dark) indicates total citations (TC) per year.

FIGURE 8 Yearly publication growth trend of top 10 sources in the field of kinship identification with the highest number of documents.
(A) The co-occurrence network of keywords is depicted. The lines connecting nodes indicate co-occurrence among distinct keywords. Distinct colors in the figure signify clusters, each comprising closely related nodes or items. Each network item is assigned to a single cluster, with an item's color determined by its cluster membership. Connecting lines between items represent links, and the distance between two items roughly indicates their relatedness. (B) Overlay visualization illustrating keywords. Keywords are displayed according to their average timing of occurrence. Green and blue circles represent earlier publications, while yellow circles denote more recent ones.

TABLE 1
Top 20 countries in the field of kinship analysis with the highest number of articles.

TABLE 2
Top 20 organizations in the field of kinship analysis with the highest number of documents. The total link strength in the table indicates the total strength of the co-authorship links of a given organization with other organizations.
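As an illustration of how a total link strength value can be derived, the following minimal sketch counts co-authorship links between organizations and sums them per organization. It assumes full counting, where each co-authored document adds one to the link between every pair of organizations on it. The source does not state which counting method was used, and the organization names and document list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def total_link_strength(documents):
    """Return {organization: total link strength} under full counting.

    `documents` is a list of organization lists, one per publication.
    """
    link_strength = defaultdict(int)
    for organizations in documents:
        # Each unordered pair of distinct organizations on a document
        # gains one unit of co-authorship link strength.
        for org_a, org_b in combinations(sorted(set(organizations)), 2):
            link_strength[(org_a, org_b)] += 1

    totals = defaultdict(int)
    for (org_a, org_b), strength in link_strength.items():
        totals[org_a] += strength
        totals[org_b] += strength
    return dict(totals)


# Hypothetical example data: three publications and their affiliations.
example_documents = [
    ["Org A", "Org B"],
    ["Org A", "Org B", "Org C"],
    ["Org C"],  # single-organization paper: contributes no links
]

print(total_link_strength(example_documents))
```

An organization with many single-institution papers can therefore rank high by document count while having a low total link strength.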

TABLE 3
Top 20 authors in the field of kinship identification with the highest H-index. TC = Total citations; NP = Number of publications; PY start = Publication years start.

TABLE 4
Top 20 sources in the field of kinship identification with the highest H-index. TC = Total citations; NP = Number of publications; PY start = Publication years start.

TABLE 5
Top 20 articles in the field of kinship identification with the highest number of citations. TC = Total citations.